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Abstract —Many state-of-art robotics applications require 
fast and efficient motion planning algorithms. Existing motion 
planning methods become less effective as the dimensionality 
of the robot and its workspace increases, especially the com¬ 
putational cost of collision detection routines. In this work, we 
present a framework to address the cost of expensive primitive 
operations in sampling-based motion planning. This framework 
determines the validity of a sample robot configuration through 
a novel combination of a Contractive AutoEncoder (CAE), 
which captures a point cloud representation of the robot’s 
workspace, and a Multilayer Perceptron, which efficiently 
predicts the collision state of the robot from the CAE and 
the robot’s configuration. We evaluate our framework on 
multiple planning problems with a variety of robots in 2D and 
3D workspaces. The results show that (1) the framework is 
computationally efficient in all investigated problems, and (2) 
the framework generalizes well to new workspaces. 

1. Introduction 

Recently, machine learning has been infused in many 
solutions to notoriously difficult problems in robotics. These 
strategies have seen a wide array of success in visual sens¬ 
ing [1], task learning [2], and human-robot cooperation [3], 
etc. It is important to understand include learning in all 
levels of a robot’s computation stack including common sub¬ 
components to solving these problems, e.g., subroutines for 
solving the motion planning problem. 

In this work, we apply machine learning to predict 
the validity of robot configurations, a common proce¬ 
dure in sampling-based motion planners. Sampling-based 
approaches have seen broad applicability to many high¬ 
dimensional and complex motion planning problems in 
robotics [4], computer graphics [5], and computational bi¬ 
ology [6]. Most of these approaches rely on the classifica¬ 
tion of robotic configurations into valid, e.g., collision-free, 
and invalid samples, while they work to construct graph 
approximations of a state space, called a roadmap. Thus, 
these approaches avoid full representation of the topology 
of the state space. However, as the dimensionality of the 
robot and the complexity of the workspace increase, the 
primitive operations of sampling-based methods become 
computationally obstructive [7]. 

Prior research has focused on improved geometric analysis 
techniques for collision checking [8], approximately model¬ 
ing state space obstacles to predict the collision status of 
robot configurations, and lazily invoking the configuration 
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validation subroutine [4], among others. While these methods 
have shown success in many applications, none have fully 
and permanently bounded the computational cost of validat¬ 
ing configurations in sampling-based motion planners. 

We propose a novel framework to efficiently and effec¬ 
tively predict the validity of robot configurations in complex 
motion planning problems. Figure 1 shows the two parts 
of our framework: (1) Offline training phase: a Contractive 
AutoEncoder (CAE) [9], is trained to embed the robot’s 
workspace into a latent space, and a Multilayer Perceptron 
(MLP), is trained to predict the validity of a robots’ configu¬ 
rations given the CAE encoding of the workspace; (2) Online 
sampling validation phase: a workspace is encoded into latent 
space by CAE and its samples are validated by MLP using 
that encoded workspace. We examine the potential impact 
of such an approach with a variety of robots in complex 
2D and 3D workspaces. For this work, we mainly focus on 
sampling because finding valid samples becomes paramount 
in narrow and cluttered passages. This work presents a 
first step towards applying our unique autoencoder variant 
to Sampling-based Motion Planning (SBMP). Our results 
indicate that: our framework is computationally efficient 
in all investigated problems, and our approach generalizes 
well to previously un-encoded workspaces. As such, our 
framework can be applied to many existing sampling-based 
routines to improve their computational efficiency. 

II. RELATED WORK 

A. Motion Planning Primitives 

Let X C denote a given state space, where d G N is the 
dimensionality of the state space. The obstacle and obstacle- 
free state spaces are defined as Xobs C X and Xfree = 
X\Xobs respectively. A path r for the robot is defined as a 
series of states {xq, ...,Xn}, where n is the number of time 
steps in the path. A path is said to be feasible if it lies entirely 
in the obstacle-free space Xf^ee 

The motion planning problem is defined as: given a state 
space X = Xfree C Xobs, an initial state Xinit ^ Xfree, and 
a goal region Xgoai ^ Xfree, find a feasible path r G Xfree 
such that Tstart = and Tend ^ Xgoai- If HO such path 

exists, report failure. 

B. Sampling-based Motion Planning (SBMP) 

A common solution to the motion planning problem is the 
utilization of sampling-based methods. Most often, sampling- 
based motion planners randomly select and progressively ex¬ 
pand and connect robot configurations to form approximate 
graph representations of Xfree, called roadmaps. A roadmap 
encodes valid states as its nodes and transitions between them 
as its edges. In order to find a path, a starting state and goal 





(a) Offline: Contractive AutoEn- 
coder 


(b) Offline: Multilayer Perceptron 
Fig. 1: Overview of Proposed Framework 


(c) Online: Sample Validation 


region are connected to the roadmap, and a feasible path is 
extracted from it. 

These techniques employ the invocation of a “black box” 
collision detection module that classifies any state in either 
Xfree ^obs to avoid explicit construction of Xobs • This is 
used in the verification of both the nodes and the edges of any 
roadmap constructed using these methodologies. Commonly 
sampling-based strategies offer guarantees on probabilistic 
completeness and asymptotic optimality [10]. 

The main paradigms of sampling-based motion planning 
are the Probabilistic RoadMap [11] and the Rapidly- 
exploring Random Tree (RRT) algorithms [12]. Both PRM 
and RRT are known to be probabilistically complete, i.e., 
if a (robust) solution exists, then a solution will almost 
surely be found as the number of samples increases [7]. 
Many proposed variations on PRM and RRT improve their 
performance for a variety of scenarios. Inoue et al. [13] 
proposed a robot path planning method that combines RRT 
and long short-term memory network, which recalls the 
path of a robot by training with a large number of paths 
generated by RRT. Zhang et.al. [14] improved RRT algo¬ 
rithm by using a target bias sampling strategy, considering 
both distance and rotation angle when choosing the nearest 
neighbor, and making the planned path as smooth as possible 
using radial basis function neural network. Hauser proposed 
Lazy-PRM* [15] a lazy collision checking strategy, which 
avoids checking connections that have no chance to improve 
upon the current best path, thus reducing the per-sample 
computational cost and accelerating solution convergence. 
This work likewise aims to improve the performance of 
sampling-based approaches — specifically through reducing 
the cost of collision detection routines. 

C. Neural Network in Motion Planning 

Representation learning [16] mainly aims to extract fea¬ 
tures from unstructured data, to either achieve a lower dimen¬ 
sional representation (often referred to as encoding) or learn 
features for supervised learning or reinforcement learning. 
There have been some recent works to utilize a learned 
low-dimensional latent model for motion planning. Ichter et 
al. [17] and Zhang et al [18] used the Conditional Variational 
Autoencoders to learn a non-uniform sampling methodology 
of a sampling-based motion planning algorithm. The work in 
[19] provides a new SBMP method using the learned latent 
space and plan motions directly in that space. Moreover, 


Chen et al. [20] proposed a time-dependent variational au¬ 
toencoder to learn dynamic motion primitives for humanoids, 
which could be applied to sampling configurations in motion 
planning. Qureshi et al. [21] learned a low-dimensional 
representation of the obstacle space by using a Contractive 
Autoencoder to encode a point cloud of the obstacles into a 
latent space and then used a feed-forward neural network to 
predict the next step an optimal planner would take given 
a start and goal. In this work, we aim to use a lower 
dimensional representation to reduce the cost of collision 
detection for use in motion planning algorithms. 

The neural network is becoming prominent in path plan¬ 
ning for its outstanding non-linear mapping ability. The 
shortcomings of neural network models include hardware 
requirements to reduce training time and difficulty of obtain¬ 
ing the best parameters for learning. However, with the fast 
development of hardware computing capabilities and unde¬ 
manding requirements of path planning, those shortcomings 
are fading steadily. Li et al. [22] addressed a motion planning 
problem for vehicles with non-linear dynamics in a clustered 
environment using near-optimal RRT, which utilizes a neural 
network to estimate the cost function considering non-linear 
kinodynamic constraints. Moreover, Shi et al. [23] presented 
a motion planning system based on hybrid deep learning, 
which uses a convolutional neural network to reduce the 
dimension of the input image, a recurrent neural network 
to create a path tracking model, and a fully connected 
neural network to construct a control model. Chen et al. [24] 
proposed an optimal robot path planning system that builds 
a map, plans optimal paths, and maneuvers mobile robots, 
in which a simplified neural network is used to calculate 
the optimal trajectory for the robot. In this work, we make 
use of the multilayer perceptron to quickly and efficiently 
determine the validity of generated samples for the motion 
planning problem. 

HI. RESEARCH FRAMEWORK 

This section introduces our proposed framework. It is 
a neural network based motion planner comprised of two 
phases. The first phase trains a Contractive AutoEncoder 
and a Multilayer Perceptrons in an offline, apriori fashion. 
The second phase performs path finding computations online 
using the trained CAE and MLR 
































A. Offline Training 

The proposed framework uses two neural network models 
to solve the motion planning problem. The first is a CAE that 
embeds a representation of the workspace, corresponding to 
a point cloud, into a latent space. The second is a MLP 
which predicts the validity of a given input sample and the 
CAE encoding of the workspace. Thus, instead of using the 
information of the whole workspace to decide whether a 
sample is valid or invalid, only the limited information from 
the encoded latent space is used. 

1) Contractive AutoEncoder (CAE): The standard con¬ 
struction of a CAE consists of training neural networks for 
the encoder and the decoder. The encoder and the decoder are 
trained together using the reconstruction error. The output of 
the encoder represents a reduced representation of the initial 
input, and the decoder reconstructs that initial input from an 
encoded representation by minimizing a cost or loss function. 

A CAE is used to embed workspace occupancy grids X 
into an invariant and robust latent space Z C where m G 
N is the dimensionality of the latent space. Let /(x, W^) be 
an encoding function with weight matrix that encodes 
an input vector x e X into a vector in the latent space 
z e Z. A decoding function g{z,W^), with weight matrix 
decodes a vector from the latent space z e Z back into 
a vector in the workspace x G X. The objective function 
(loss function) for the CAE is: 

Lcae = - 9ifix,W%W‘^)\\^+Xj2iWtjf 

xeD ij 

( 1 ) 

where A is a penalizing coefficient and D C X is the 
occupancy grid data for different workspaces. The penalizing 
term forces the latent space z = f{x, W^) to be contractive 
in the neighborhood of the training data which results in 
an invariant and robust feature learning [9]. Since we use 
linear encoder, the loss function of CAE is similar to the 
loss function of a regularized autoencoder. Moreover, our 
training data mainly consists of an unlabeled representation 
of the environment, we apply autoencoders to learn the low¬ 
dimensional representation of the training data instead of 
end-to-end architecture. By using the reconstruction (error) 
from the decoder, we could analyze a quantifiable metric 
used to decide which low-dimensional representation accu¬ 
rately replicates our environment the best. 

2) Multilayer Perceptrons (MLP): We use a multilayer 
perceptron to perform sample validity. Given a workspace 
encoding z = /(x, W^) G Z, the samples’ information, 
MLP predicts whether a sample is valid or not. The training 
objective for the MLP is to minimize the classification loss 
between the predicted sample’s label and its actual label. 
Thus we use Cross-Entropy as our loss function since it’s 
been shown to perform excellently for classification problems 
[25], [26]. 

B. Online Sample Validation and Path Planning 

The online phase exploits the neural models from the 
offline phase to do motion planning in cluttered and complex 


workspaces. Algorithm 1 presents the overall path genera¬ 
tion procedure. Eirst, using the offline-trained CAE, W is 
encoded into a latent space Z. Then, a set of samples S 
is generated based on the specification of a robot R and its 
particular workspace W. Next, the offline-trained MLP takes 
the sample set S and the encoded latent space Z as input 
to determine a set of probably valid samples S' C S. Then, 
the probably valid sample set S' is connected to create a 
roadmap by a proximity search function. If R is not found, 
additional valid samples would be generated using auxiliary 
samplers. Einally, from R, a local planning function would 
return a feasible path between a given start and goal. We 
will discuss in more detail for each step in our algorithm in 
the following sections. 


Algorithm 1 SBMP Variant 

Input New workspace X (point cloud data). Query Xinn, 

^goal 

1: Use the CAE to encode X into the latent space Z. 

2: Randomly sample a set of states, S. 

3: Eeed S and Z into MLP, which predicts the validity 
of each sample in S. After this step S' a set of most 
probably valid samples are retained. 

4: Peed S' into a standard PRM approach to yield a valid 
roadmap R. If R is not found, create additional valid 
samples using auxiliary samplers. 

5: Extract a path r from R between Xinu and Xgoai- 


1) Sample Generation: We use the following auxiliary 
samplers to generate the initial sample set S from each 
workspace. Basic PRM (BS) [11] creates samples uniformly 
and randomly. Obstacle Based PRM (OB) [27] creates 
samples near the boundary of obstacles. Gaussian Sampler 
(G) [28] also creates samples around obstacles using adaptive 
probability and collision data. Bridge Test (BT) [29] creates 
samples using a short segment with two configurations and 
their midpoint. 

2) Workspace Encoder: The encoding function/(x, lU®), 
trained in the offline phase, is used to map the workspace 
occupancy grid x G X into a latent space Z C 

3) Sample’s Validity: The MLP, a feed-forward neural 
network from the offline phase, takes the workspace encod¬ 
ing Z, the samples’ information S, and predict samples’ va¬ 
lidity to create S'. In short, MLP is a classifier distinguishing 
between valid and invalid samples. To introduce stochasticity 
into the MLP and to prevent over-fitting, some of the hidden 
units in each hidden layer of the MLP were dropped out with 
a probability p. Dropout is applied layer-wise to a neural 
network and it drops each unit in the hidden layer with a 
probability p : [0,1] G M. 

4) Connection and Path Planning: We use a straight line 
local planner. Particularly a PRM approach is applied to S' to 
generate a roadmap R, and the path is extracted and verified. 

IV. IMPLEMENTATION DETAILS 

This section gives the implementation details of our frame¬ 
work. The proposed neural models, CAE and MLP, are 




implemented in PyTorch [30]. 

A. Data Collection 

1) Workspace Data: To generate different workspace for 
training and testing, a number of quadrilateral blocks were 
placed in the operating region of 31 x 31 for 2D simple 
workspaces (2DS), 41x41x6 for 3D office-like workspaces 
(3DO), and 11 x 11 x 11 for 3D clutter workspaces (3DC). 
The placement of these blocks was randomly chosen in the 
operating region. Each random placement of the obstacle 
blocks leads to a different workspace. For each environment, 
the locations and numbers and types of obstacles are gen¬ 
erated randomly. First, the number of obstacles is chosen 
randomly in a pre-defined range. Then for each obstacle in 
that environment, the type is chosen among the pre-defined 
types (small box, big box,..) and its location is also randomly 
determined. Currently, we do not consider any rotation of the 
blocks, i.e., they are axis aligned. Figure 2, 3, and 4 show 
some examples of our workspaces. 

We represent the workspaces as an occupancy grid with 
values of —1 and 1, in which 1 means there is an obstacle 
at that point, and —1 means the point is in free space. The 
input to the encoder is an occupancy grid of size n x m for 
2D and n x m x /c for 3D, where n, m, and k are the number 
of points along each dimension of the workspace. 

We use multiple workspace representations to train and 
test our CAE. The training data set of the 2D workspaces 
comprised of 30 workspaces, and for testing, two types of 
test data sets were created to evaluate the proposed method. 
The first test data set comprised of the 30 workspaces used 
in training, and the second test data set comprised of 10 
previously unseen workspaces, i.e., 10 workspaces not from 
the training set. There are around 3 to 5 obstacle blocks with 
different shapes and sizes in each workspace. 

The training data set of the 3D office-like workspaces 
comprised of 100 workspaces, and for testing 100 known 
workspaces and 20 previously unseen workspaces with 25 
to 30 obstacle blocks in each workspace were used. 

The training data set of the 3D clutter workspaces 
comprised of 50 workspaces, and for testing 30 known 
workspaces and 10 previously unseen workspaces with 
110 to 125 same shape and size obstacle blocks in each 
workspace were used. 

To test the performance of our MLP, we generate multiple 
different workspaces with different starts and goals. For 2D 
simple and 3D office-like, we randomly generate 10 different 
workspaces with 10 different start and goal positions for each 
of them; and 100 3D clutter workspaces with 1 different start 
and goal positions for each of them to test the performance 
of our proposed framework in terms of time. 

2) Sample Data: For the 2D simple workspace, we use 
a point-mass robot with 2 degrees of freedom (DoF) that 
represents its {x, y} position. The point robot is (0.5x0.5) in 
a (31x31) environment. For the 3D office-like, we use KuKa 
robot [31], which has 5 joints, with a total of 8 DoF. The 8 
DoFs represent the robot’s {x, y} position, rotation, and each 
joint’s rotation. The Kuka robot is the size of (1.5x1x1) in a 
(41x41x6) environment. For the 3D clutter workspaces, we 



Fig. 2: Examples of 2D simple 


Fig. 3: Examples of 3D office-like 
workspace with Kuka robot 



Fig. 4: Example of 3D clutter 
workspace 

use box robots with 7 DoF and 9 DoF. The DoFs represent 
the robot’s {x^y^z} position, rotation, and each 

joint’s rotation. The box robots consist of 2 types of links: 
a small link of (0.1x0.1x0.1) and a big link of (0.4x0.1x0.1) 
in a (11x11x11) environment. The 7D box robot has 1 joint 
for 2 links, and the 9D box robot has 3 joints for 2 small 
links and 2 big links. We generate 100 samples for each 
2D workspace and 200 samples for each 3D workspace for 
training and testing our MFP separately. This is evaluated as 
seen scenarios. Those samples are approximately distributed 
50/50 between valid and invalid samples. 

Since our approach mainly focuses on verifying the sam¬ 
ple’s validity, we incorporate it into the standard PRM ap¬ 
proaches using algorithm 1 to test its effectiveness. However, 
instead of having a large set of samples 5", we provide a 
small set of samples S' and let the algorithm decide whether 
those supplied samples are enough or there is a need to gener¬ 
ate more samples. We think that providing the algorithm with 
too many samples would negatively impact the performance 
of neighborhood finding routines in sampling-based planners. 
Thus, we generate 200 samples regardless of their validity 
for each 2D simple workspace, 2000 samples for each 3D 
office-like workspace, and 400 samples for each 3D clutter 
workspace. This is evaluated as unseen scenarios 

B. Model Architecture 

1) Contractive AutoEncoder: Since the structure of the 
decoders is the inverse of the encoder, we only describe the 
structure of the encoder. 

For the 2D workspaces, the encoding function /(x, W^) 
and decoding function consist of five 

linear layers and one output layer, where each layer uses 













Parametric Rectified Linear Unit (PReLU) [32] as its acti¬ 
vation function, for loss function, we use mean square error 
with weight-decay. The layers 1 to 5 transforms the input 
vector 31 x 31 to 512, 256, 128, 64, and 32 hidden units, 
respectively. The output layer takes 32 units as input and 
outputs 12 units. Hence the workspace representation in the 
latent space is a vector of size 12. Then, those 12 units are 
used by the decoder to reconstruct the 31 x 31 space. 

For the 3D office-like workspaces, the encoding function 
f{x,W^) and decoding function are the 

same as in 2D simple workspace but consist of seven linear 
layers and one output layer. Layers one to seven transform 
the input vectors to 5043, 3125, 1600, 800, 400, 200, 100 
hidden units, respectively. The output layer takes the 100 
units from the seventh layer and transforms them into 50 
units. For the 3D clutter workspaces, layers one to six 
transform the input vectors to 1000, 800, 600, 400, 200, 100 
hidden units, the output layer transforms the input 100 units 
to output 50 units. 

2) Multi-layer Perceptrons: The input is given by con¬ 
catenating the encoded workspace’s representation Z, and 
the DoF for each robot from a given state. Each of the layers 
is a sandwich of a linear layer, a Parametric Rectified Linear 
Unit (PReLU), and Dropout (p). 

For the 2D workspaces, MLP is a 4-layer neural network. 
Layers one and two map the input vectors to 6, 4 units, re¬ 
spectively. The output layer takes the 4 units and transforms 
them into the validity of each sample, which is either valid 
or invalid. For the 3D workspaces, MLP is a 7-layer neural 
network. Layers one to six map the input vectors to 50, 40, 
30, 20, 10, 5 units, respectively. The output layer takes the 
5 units from the sixth layer and transforms them into the 
validity of each sample. 

3) Parameters: To train the neural models CAE and 
MLP for both 2D and 3D workspace, the Adagrad [33] 
optimizer was used with the learning rate of 0.1. The Dropout 
probability p and penalizing term A were set to 0.5 and 0.001, 
respectively. 


V. RESULTS 


This section presents the results of the framework for the 
motion planning of a point robot in the 2D workspaces and 
the KuKa robot and a box robot in 3D workspaces. 

Table I shows the perfor¬ 
mance for the CAE. Eor 2DS, 
the average accuracy of our 
CAE is 100% with the vari¬ 
ance of 0.4% for both already 
seen workspaces and unseen 
workspaces. Eor 3DO, the av¬ 



Seen 

Unseen 

2DS 

100% 

100% 

3DO 

96% 

94% 

3DC 

98% 

95% 


TABLE I: Performance of 
CAE 


erage accuracy of our CAE is 96% with the variance of 3% 
for already seen workspaces and the average accuracy is 94% 
with the variance of 5% for unseen workspaces. The accuracy 
and variance are 98% and 1% and 95% and 3% for seen and 
unseen 3D clutter, respectively. Overall, the accuracy of our 
CAE is excellent. Thus, we are able to learn the underlying 
structure of the workspaces quite well. 



Seen 

Unseen 

Acc 

TPR 

TNR 

Acc 

TPR 

TNR 

2DS 

92% 

96% 

88% 

88% 

90% 

85% 

3DO 

76% 

88% 

64% 

73% 

85% 

60% 

3DC-7 

75% 

87% 

64% 

71% 

85% 

61% 

3DC-9 

73% 

82% 

66% 

72% 

82% 

62% 


TABLE II: Accuracy, True Positive Rate, and True Negative 
Rate of MLP 


Table II shows the accuracy, true positive rate (TPR), and 
true negative rate (TNR) for the MLP, with false negative 
rate = 1-TPR and false positive rate = 1-TNR. Eor 2D 
workspaces, the average accuracy of our MLP is 92% with 
the variance of 2% for already seen workspaces and the 
average accuracy is 88% with the variance of 4% for unseen 
workspaces. Eor 3DO workspace, the average accuracy of 
our MLP is 76% with the variance of 6% for already 
seen workspaces, and the average accuracy is 73% with the 
variance of 8% for unseen workspaces. The average accuracy 
is 73% with 5% variance (seen), and 71% with 9% variance 
(unseen) for 3DC-7. The average accuracy is 73% with 6% 
variance (seen), and 72% with 10% variance (unseen) for 
3DC-9. Additionally, the true positive rates and true negative 
rates are quite good for our MLP, thus most of the valid 
samples and invalid samples are successfully validated by 
MLP. Overall, the accuracy of our MLP is acceptable. 



Sampling Time 

Total Time 

BS 

OB 

G 

BT 

BS 

OB 

G 

BT 

2DS 

53% 

57% 

63% 

74% 

17% 

18% 

11% 

8% 

3DO 

45% 

58% 

41% 

56% 

1% 

9% 

10% 

-8% 

3DC-7 

49% 

58% 

51% 

61% 

-2% 

14% 

-9% 

54% 

3DC-9 

39% 

59% 

48% 

62% 

18% 

20% 

13% 

37% 


TABLE III: Average time improvement ratio with and with¬ 
out applying our proposed framework for sampling time and 
total time. 

Table III show the average time improvement ratio com¬ 
parison for sampling time and total time with and without 
applying our method with Basic PRM (BS), Obstacle Based 
PRM (OB), Gaussian (G), and Bridge Test (BT) for solving 
motion planning problems. 

In 2D workspaces, the average time to validate samples 
without and with using our method are: 0.0004 ± 0.0002 
and 0.0002 ± 0.0002 for BS, 0.05 ± 0.01 and 0.03 ± 0.01 
for OB, 0.04 ± 0.01 and 0.02 ± 0.01 for G, 0.26 ± 0.23 and 
0.11 ± 0.03 for BT. There are significant time improvement 
when using our method for classify valid samples. Moreover, 
the average total execution time for each query without and 
with using our method are: 0.12 ± 0.01 and 0.10 ± 0.01 
for BS, 0.38 ± 0.05 and 0.31 ± 0.03 for OB, 0.09 ± 0.01 
and 0.09 ± 0.01 for G, 0.13 ± 0.01 and 0.13 ± 0.01 for BT. 
Notable, our framework has good performance. Additionally, 
we notice there is a nearly 20% improvement when applying 
our framework to OB. 

In 3DO workspaces, the average time to validate samples 
without and with using our method are: 0.11 ± 0.01 and 
0.06 ± 0.02 for BS, 1.13 ± 0.08 and 0.56 ± 0.25 for OB, 
0.21±0.04 and 0.12±0.02 for G, 0.76±0.13 and 0.32±0.02 
for BT. The time improvements when using our method for 








Fig. 5: Time performance with and without applying our 
method in 3DO workspaces 



Fig. 6: Time performance with and without applying our 
method in 3DC-7 workspaces 



Fig. 7: Time performance with and without applying our 
method in 3DC-9 workspaces 


classifying valid samples are quite good. Additionally, the 
average total execution time for each query without and with 
using our method are: 40.86 ± 36.47 and 40.51 ± 25.91 for 

BS, 43.74 ± 22.01 and 39.88 ± 18.72 for OB, 42.32 ± 41.99 
and 38.22±30.01 for G, 40.40±33.92 and 43.13±28.87 for 

BT. There are improvements when using our framework with 
OB (9%) and G (10%), but there is a small improvement for 
BS (1%) and a slow-down for BT (8%). We suspect that it 
could be that the supplied samples are not good enough, thus 
there is the need for re-sampling. 

The average time to validate samples without and with 
using our method for 3DC-7 are: 0.007 ±0.003 and 0.003 ± 
0.001 for BS, 0.20±0.03 and 0.09±0.01 for OB, 0.09±0.01 
and 0.04 ± 0.02 for G, 0.42 ± 0.04 and 0.17 ± 0.02 for BT; 
and for 3DC-9 are: 0.015 ± 0.005 and 0.008 ± 0.003 for 

BS, 0.08 ± 0.01 and 0.04 ± 0.01 for OB, 0.04 ± 0.01 and 
0.02±0.01 for G, 0.10±0.02 and 0.04±0.02 for BT. Overall, 
our method reduces the sampling classification time notably. 
Moreover, the average total execution time for each query 
without and with using our method for 3DC-7 are: 0.53±0.03 
and 0.54 ± 0.03 for BS, 0.65 ± 0.35 and 0.56 ± 0.13 for OB, 
0.45±0.40 and 0.49±0.12 for G, 0.95±0.64 and 0.44±0.05 
for BT; and for 3DC-9 are: 0.73 ± 0.48 and 0.60 ± 0.19 
for BS, 3.27 ± 3.62 and 2.60 ± 0.19 for OB, 3.90 ± 4.73 
and 3.40 ± 0.41 for G, 7.10 ± 0.77 and 4.50 ± 0.50 for 

BT. There are quite notable time improvements when using 


our framework to augment the motion planning algorithms 
for 3DC workspaces. In 3DC workspaces, the BT samplers 
perform the worst because of the open nature of our testing 
workspaces. 

Figure 5, 6, 7 show the performance with (BSF, OBF, GF, 
BTF) and without (BS, OB, G, BT) applying our proposed 
framework in second for 3DO, 3DC-7, 3DC-9 workspace. 
Overall, there is a notable improvement in performance 
in most cases. The standard deviations when using our 
framework decreased compared to not using it. Thus, our 
framework augments the algorithms perfectly and makes 
them more stable in solving queries reliably. Although we 
only test our approach using fixed size workspaces, it is 
robust to varying sizes. By splitting the workspace into parts 
with a pre-defined size, the proposed approach would still 
be able to validate samples. Moreover, by adjusting the 
number of samples in the sampling validation phase and the 
number of samples generated by the auxiliary sampler, our 
method guarantees probabilistic completeness for any motion 
planning problem. Therefore, our approach is applicable to 
most of sampling based planners. 

VI. Conclusion 

In this paper, we present a fast and efficient neural network 
framework for sampling-based motion planners. The frame¬ 
work consists of a Contractive AutoEncoder that encodes a 
point cloud representation of a robot’s workspace into a latent 
feature space and a feed-forward neural network that takes 
that encoded workspace and robot configuration details to 
predict the validity of that configuration. The main advantage 
of our framework is that we use an encoded workspace and 
neural network to reduce or even remove expensive collision 
detection operations. In this paper, we focus on the sampling 
phase of SBMP and how we can reduce number of collision 
checks made. Our proposed method is quite robust such that 
it can be integrated into any sampling strategy. 

In the future, we plan to research models that increase the 
accuracy of our MLP by pre-processing a configuration’s 
information before training. Furthermore, since our method 
considers the environment’s and robot’s information when 
performing collision checks, we could train them on regions 
of the environments. Thus, instead of using our method for 
the whole environment, an SBMP algorithm could use our 
method when it reaches certain potions of that environment. 
Additionally, we could apply this approach to other aspects 
of SBMP, e.g., edge prediction, and path validation. 
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